Are audio or textual training data more important for ASR in less-represented languages?

Authors

  • Thomas Pellegrini
  • Lori Lamel
Abstract

State-of-the-art speech recognizers are typically trained on very large amounts of data, both transcribed speech and texts. With the recent growing interest in developing speech technologies for languages for which only small amounts of data are accessible, collecting appropriate data is a key issue in building new speech recognition systems. This article reports on an experimental study assessing the performance of a speech recognizer for a less-represented language, as a function of the quantity of texts and transcribed speech data available for model training. The experimental results show that for supervised training with only 2 hours of manually transcribed data, the acoustic models are the weak point. With 10 hours or more of transcribed audio data, the quantity of texts has a larger effect on the error rate than the quantity of speech.

Similar resources

Exploring Novice Raters’ Textual Considerations in Independent and Negotiated Ratings

Educators often employ various training techniques to reduce raters’ subjectivity. Negotiation is a technique which can assist novice raters to co-construct a shared understanding of the writing assessment when rating collaboratively. There is little research, however, on rating behaviors of novice raters while employing negotiation techniques and the effect of negotiation on their understandin...

Speech alignment and recognition experiments for Luxembourgish

Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe’s under-described languages. In this paper, we propose to study acoustic similarities between Luxembourgish and major contact languages (German, French, English) with the help of automatic speech alignment and recognition systems. Experiments were run using monolingual ac...

Comparison of different machine learning methods for extractive speech-to-speech summarization of Persian without using transcripts

In this paper, extractive speech summarization using different machine learning algorithms was investigated. The task of speech summarization deals with extracting important and salient segments from speech in order to access, search, extract and browse speech files more easily and at lower cost. In this paper, a new method for speech summarization without using automatic speech recognitio...

Wiki-like Editing of Imperfect Computer-Generated Webcast Transcripts

As the use of Internet broadcasting (webcasting) increases, more webcasts will be archived and accessed numerous times retrospectively. One challenge in skimming and browsing through such archives is the lack of textual transcripts of the archived media's audio channel. Ideally, transcripts would be obtainable through Automatic Speech Recognition (ASR). However, current ASR systems can only del...

Automatic transcription of Somali language

Most African countries follow an oral tradition system to transmit their cultural, scientific and historic heritage through generations. This ancestral knowledge, accumulated over centuries, is today in danger of disappearing. Automatic transcription and indexing tools seem a potential solution to preserve it. This paper presents the first steps of automatic speech recognition (ASR) of Djibouti ...

Publication date: 2008